- Xref: bloom-picayune.mit.edu alt.binaries.sounds.misc:3804 alt.binaries.sounds.d:1997 comp.dsp:4886 news.answers:4658
- Path: bloom-picayune.mit.edu!enterpoop.mit.edu!ira.uka.de!math.fu-berlin.de!news.netmbx.de!Germany.EU.net!mcsun!sun4nl!cwi.nl!guido
- From: guido@cwi.nl (Guido van Rossum)
- Newsgroups: alt.binaries.sounds.misc,alt.binaries.sounds.d,comp.dsp,news.answers
- Subject: FAQ: Audio File Formats (version 2.9)
- Message-ID: <audio-fmts_724600938@charon.cwi.nl>
- Date: 17 Dec 92 14:02:32 GMT
- Expires: 14 Jan 93 14:02:18 GMT
- Sender: news@cwi.nl
- Reply-To: guido@cwi.nl
- Followup-To: alt.binaries.sounds.d,comp.dsp
- Lines: 1518
- Approved: news-answers-request@MIT.Edu
- Supersedes: <audio-fmts_722270826@charon.cwi.nl>
-
- Archive-name: audio-fmts/part1
- Submitted-by: Guido van Rossum <guido@cwi.nl>
- Version: 2.9
- Last-modified: 17-Dec-1992
-
- FAQ: Audio File Formats (version 2.9)
- =====================================
-
- Table of contents
- -----------------
-
- Introduction
- Device characteristics
- Popular sampling rates
- Compression schemes
- Current hardware
- File formats
- File conversions
- Playing audio files on UNIX
- Playing audio files on micros
- The Sound Site Newsletter
- Posting sounds
-
- Appendices:
-
- FTP access for non-internet sites
- AIFF Format (Audio IFF)
- The NeXT/Sun audio file format
- IFF/8SVX Format
- Playing sound on a PC
- The EA-IFF-85 documentation
- US Federal Standard 1016 availability
- Creative Voice (VOC) file format
- RIFF WAVE (.WAV) file format
- U-LAW and A-LAW definitions
- AVR File Format
-
-
- Introduction
- ------------
-
- This is version 2 of this FAQ, which I started in November 1991 under
- the name "The audio formats guide". I bumped the major version number
- since the Subject and Newsgroups headers have changed to make the
- subject more informative and give the guide a wider audience. I also
- added a Table of contents section at the top.
-
- I am posting this about once a fortnight, either unchanged (just to
- inform new readers), or updated (if I learn more or when new hardware
- or software becomes popular). I post to alt.binaries.sounds.{misc,d}
- and to comp.dsp, for maximal coverage of people interested in audio,
- and to news.answers, for easy reference.
-
- A companion posting with subject "Change to: ..." is occasionally
- posted listing the diffs between a new version and the last. This is
- not reposted, and it is suppressed when the diffs are bigger than the
- new version.
-
- NEWSFLASH: This FAQ is now also available in distributed hypertext
- form! If you have a WWW browser and direct Internet access you can
- point it to "http://voorn.cwi.nl/audio-formats/a00.html". (WWW is the
- CERN World-Wide Web initiative; for more info, telnet or ftp to
- info.cern.ch.)
-
- Send updates, comments and questions to <guido@cwi.nl>; flames to
- /dev/null.
-
- I'd like to thank everyone who sent me mail with updates for previous
- versions. The list of names is really too long to list you all...
-
- --Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
- "Lobster thermidor aux crevettes with a mornay sauce garnished with
- truffle pate, brandy and a fried egg on top and spam"
-
-
- Device characteristics
- ----------------------
-
- In this text, I will only use the term "sample" to refer to a single
- output value from an A/D converter, i.e., a small integer number
- (usually 8 or 16 bits).
-
- Audio data is characterized by the following parameters, which
- correspond to settings of the A/D converter when the data was
- recorded. Naturally, the same settings must be used to play the data.
-
- - sampling rate (in samples per second), e.g. 8000 or 44100
-
- - number of bits per sample, e.g. 8 or 16
-
- - number of channels (1 for mono, 2 for stereo, etc.)
-
- Approximate sampling rates are often quoted in Hz or kHz ([kilo-]
- Hertz); however, the politically correct term is samples per second
- (samples/sec). Sampling rates are always measured per channel, so for
- stereo data recorded at 8000 samples/sec, there are actually 16000
- samples in a second. I will sometimes write 8 k as a shorthand for
- 8000 samples/sec.
-
- Multi-channel samples are generally interleaved on a frame-by-frame
- basis: if there are N channels, the data is a sequence of frames,
- where each frame contains N samples, one from each channel. (Thus,
- the sampling rate is really the number of *frames* per second.) For
- stereo, the left channel usually comes first.
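-
- As a rough illustration of the frame layout described above, here is
- a minimal Python sketch. The function names split_frames and
- samples_per_second are made up for this example and do not come from
- any package mentioned in this FAQ.
-
```python
def split_frames(samples, channels):
    """Split interleaved samples into one list per channel.

    'samples' is a flat sequence of frames, each frame holding one
    sample per channel (for stereo: left, right, left, right, ...).
    """
    if channels < 1 or len(samples) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    return [list(samples[ch::channels]) for ch in range(channels)]


def samples_per_second(frame_rate, channels):
    """Total samples per second across all channels."""
    return frame_rate * channels


# Three stereo frames: L0 R0 L1 R1 L2 R2
left, right = split_frames([10, -10, 20, -20, 30, -30], 2)
print(left, right)                    # [10, 20, 30] [-10, -20, -30]
print(samples_per_second(8000, 2))    # 16000
```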
-
- The specification of the number of bits for U-LAW (pronounced mu-law
- -- the u really stands for the Greek letter mu) samples is somewhat
- problematic. These samples are logarithmically encoded in 8 bits,
- like a tiny floating point number; however, their dynamic range is
- that of 14 bit linear data. Source for converting to/from U-LAW
- (written by Jef Poskanzer) is distributed as part of the SOX package
- mentioned below; it can easily be ripped apart to serve in other
- applications. The official definition is the CCITT standard G.711.
-
- There exists another encoding similar to U-LAW, called A-LAW, which
- is used as a European telephony standard. There is less support for
- it in UNIX workstations.
-
- (See the Appendix for some formulae describing U-LAW and A-LAW.)
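-
- To give a feel for the logarithmic encoding, here is a sketch of the
- commonly quoted continuous mu-law curve with mu = 255. This is an
- assumption made for illustration: it works on floating point values
- in the range -1..1 and is NOT the bit-exact 8-bit coding defined by
- G.711. The function names are hypothetical.
-
```python
import math

MU = 255.0  # assumed value of mu for the continuous companding curve


def ulaw_compress(x):
    """Map a linear value in [-1, 1] onto the companded range [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def ulaw_expand(y):
    """Inverse of ulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


# Quiet signals get a disproportionately large share of the code space:
# an input at 1% of full scale already lands well above 1% of the output.
print(ulaw_compress(0.01))
print(ulaw_expand(ulaw_compress(0.01)))
```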
-
-
- Popular sampling rates
- ----------------------
-
- Some sampling rates are more popular than others, for various reasons.
- Some recording hardware is restricted to (approximations of) some of
- these rates, some playback hardware has direct support for some. The
- popularity of divisors of common rates can be explained by the
- simplicity of clock frequency dividing circuits :-).
-
- Samples/sec  Description
-
- 5500         One fourth of the Mac sampling rate (rarely seen).
-
- 7333         One third of the Mac sampling rate (rarely seen).
-
- 8000         Exactly 8000 samples/sec is a telephony standard that
-              goes together with U-LAW (and also A-LAW) encoding.
-              Some systems use a slightly different rate; in
-              particular, the NeXT workstation uses 8012.8210513,
-              apparently the rate used by Telco CODECs.
-
- 11 k         Either 11025, a quarter of the CD sampling rate,
-              or half the Mac sampling rate (perhaps the most
-              popular rate on the Mac).
-
- 16000        Used by, e.g. the G.722 compression standard.
-
- 18.9 k       CD-ROM/XA standard.
-
- 22 k         Either 22050, half the CD sampling rate, or the Mac
-              rate; the latter is precisely 22254.545454545454 but
-              usually misquoted as 22000. (Historical note:
-              22254.5454... was the horizontal scan rate of the
-              original 128k Mac.)
-
- 32000        Used in digital radio, NICAM (Nearly-Instantaneous
-              Companded Audio Multiplex [IBA/BREMA/BBC]) and other
-              TV work, at least in the UK; also long play DAT and
-              Japanese HDTV.
-
- 37.8 k       CD-ROM/XA standard for higher quality.
-
- 44056        This weird rate is used by professional audio
-              equipment to fit an integral number of samples in a
-              video frame.
-
- 44100        The CD sampling rate. (DAT players recording
-              digitally from CD also use this rate.)
-
- 48000        The DAT (Digital Audio Tape) sampling rate for
-              domestic use.
-
- Files sampled on SoundBlaster hardware have sampling rates that are
- divisors of 1000000.
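-
- If one assumes that "divisors of 1000000" means the hardware rate is
- 1000000 divided by a whole number (an assumption; the exact register
- arithmetic of the card is not covered here), the rate such hardware
- can offer closest to a requested one can be estimated as follows.
- The function name is made up for this sketch.
-
```python
def nearest_divisor_rate(target, clock=1_000_000):
    """Return (divisor, rate) with rate = clock / divisor closest to target."""
    low = max(1, int(clock // target))
    # The best whole-number divisor is either 'low' or the next one up.
    divisor = min((low, low + 1), key=lambda d: abs(clock / d - target))
    return divisor, clock / divisor


for wanted in (8000, 11025, 22050):
    divisor, rate = nearest_divisor_rate(wanted)
    print(wanted, divisor, round(rate, 1))
```
-
- Under that assumption a request for 22050 ends up at roughly 22222
- samples/sec, which is within the 1-2% tolerance discussed below.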
-
- While professional musicians disagree, most people don't have a problem
- if recorded sound is played at a slightly different rate, say, 1-2%.
- On the other hand, if recorded data is being fed into a playback
- device in real time (say, over a network), even the smallest
- difference in sampling rate can frustrate the buffering scheme used...
-
- There may be an emerging tendency to standardize on only a few
- sampling rates and encoding styles, even if the file formats may
- differ. The suggested rates and styles are:
-
- rate (samp/sec)   style                   mono/stereo
-
- 8000              8-bit U-LAW             mono
- 22050             8-bit linear unsigned   mono and stereo
- 44100             16-bit linear signed    mono and stereo
-
-
- Compression schemes
- -------------------
-
- Strange though it seems, audio data is remarkably hard to compress
- effectively. For 8-bit data, a Huffman encoding of the deltas between
- successive samples is relatively successful. For 16-bit data,
- companies like Sony and Philips have spent millions to develop
- proprietary schemes.
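-
- Why deltas help can be shown with a small experiment. The sketch
- below does not implement Huffman coding itself; it only compares the
- entropy (a rough estimate of the bits per sample an ideal Huffman-like
- coder would need) of a made-up, slowly varying 8-bit signal with the
- entropy of its deltas. All names and the test signal are assumptions
- for illustration.
-
```python
import math
from collections import Counter


def deltas(samples):
    """Differences between successive 8-bit samples, modulo 256."""
    prev = 0
    out = []
    for s in samples:
        out.append((s - prev) & 0xFF)
        prev = s
    return out


def undeltas(diffs):
    """Rebuild the original 8-bit samples from their deltas."""
    prev = 0
    out = []
    for d in diffs:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return out


def entropy_bits(values):
    """Shannon entropy in bits per value."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# A slowly varying unsigned 8-bit test signal (purely synthetic).
signal = [int(128 + 100 * math.sin(i / 40.0)) for i in range(4000)]

print("raw   :", entropy_bits(signal))
print("deltas:", entropy_bits(deltas(signal)))
```
-
- Real recordings are noisier than this synthetic signal, so the gain
- in practice is smaller.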
-
- Public standards for voice compression are slowly gaining popularity,
- e.g. CCITT G.721 and G.723 (ADPCM at 32 and 24 kbits/sec). (ADPCM ==
- Adaptive Differential Pulse Code Modulation.) Free source code for a *fast*
- 32 kbits/sec ADPCM algorithm is available by ftp from ftp.cwi.nl as
- /pub/adpcm.shar. (** NOTE: if you are using v1.0, you should get
- v1.1, released 17-Dec-1992, which fixes a serious bug -- the quality
- of v1.1 is claimed to be better than uLAW **)
-
- There are also two US federal standards, 1016 (Code excited linear
- prediction (CELP), 4800 bits/s) and 1015 (LPC-10E, 2400 bits/s). See
- also the appendix for 1016.
-
- (Note that U-LAW and silence detection can also be considered
- compression schemes.)
-
- Here's a note about audio codings by Van Jacobson <van@ee.lbl.gov>:
- Several people used the words "LPC" and "CELP" interchangeably. They
- are very different. An LPC (Linear Predictive Coding) coder fits
- speech to a simple, analytic model of the vocal tract, then throws
- away the speech & ships the parameters of the best-fit model. An LPC
- decoder uses those parameters to generate synthetic speech that is
- usually more-or-less similar to the original. The result is
- intelligible but sounds like a machine is talking. A CELP (Code
- Excited Linear Predictor) coder does the same LPC modeling but then
- computes the errors between the original speech & the synthetic model
- and transmits both model parameters and a very compressed
- representation of the errors (the compressed representation is an
- index into a 'code book' shared between coders & decoders -- this is
- why it's called "Code Excited"). A CELP coder does much more work
- than an LPC coder (usually about an order of magnitude more) but the
- result is much higher quality speech: The FIPS-1016 CELP we're working
- on is essentially the same quality as the 32Kb/s ADPCM coder but uses
- only 4.8Kb/s (the same as the LPC coder).
-
- Finally, the comp.compression FAQ has some text on the 6:1 audio
- compression scheme used by MPEG (a video compression standard-to-be).
- It's interesting to note that video compression reaches much higher
- ratios (like 26:1). This FAQ is ftp'able from rtfm.mit.edu
- [18.72.1.58] in directory /pub/usenet/news.answers/compression-faq,
- files part1 and part2.
-
- Comp.compression also carries a regular posting "How to uncompress
- anything" by David Lemson <lemson@uiuc.edu>, which (tersely) hints on
- which program you need to uncompress a file whose name ends in .<foo>
- for almost any conceivable <foo>. Ftp'able from ftp.cso.uiuc.edu
- (128.174.5.59) in the directory /doc/pcnet as the file compression.
-
-
- Current hardware
- ----------------
-
- I am aware of the following computer systems that can play back and
- (sometimes) record audio data, with their characteristics. Note that
- for most systems you can also buy "professional" sampling hardware,
- which supports much better quality, e.g. >= 44.1 k 16 bits stereo.
- The characteristics listed here are a rough estimate of the
- capabilities of the basic hardware only (and even here I am on thin
- ice, with systems becoming ever more powerful).
-
- machine                bits          max sampling rate   #output channels
-
- Mac                    8             22k                 1
- Apple IIgs             8             32k / >70k          8(st)
- PC/Soundblaster v1     8             13k / 22k           1
- PC/Soundblaster v2     8             15k / 44.1k         1
- PC/PAS-16              16            44.1k               ?(st)
- Atari ST               8             22k                 1
- Atari STe,TT           8             50k                 2
- Atari Falcon 030       16            50k                 8(st)
- Amiga                  8             ~29k                4(st)
- Sun Sparc              U-LAW         8k                  1
- Sun Sparcst. 10        U-LAW,8,16    48k                 1(st)
- NeXT                   U-LAW,8,16    44.1k               1(st)
- SGI Indigo             8,16          48k                 4(st)
- Acorn Archimedes       ~U-LAW        ~180k               8(st)
- Sony RISC-NEWS         8, 16         37.8k               ?(st)
- VAXstation 4000        U-LAW         8k                  1
- Tandy 1000/*L*         8-bit         22k                 3
-
- 4(st) means "four voices, stereo"; sampling rates xx/yy are
- different recording/playback rates; *L* is any type with 'L' in it.
-
- All these machines can play back sound without additional hardware,
- although the needed software is not always standard; only the Sun,
- NeXT and SGI come with standard sampling hardware (the NeXT only
- samples U-LAW at 8000 samples/sec from the built-in microphone port;
- you need a separate board for other rates).
-
- The new VAXstation 4000 (VLC and model 60) series lets you PLAY audio
- (.au) files, and the package DECsound will let you do the recording.
- In fact, DECsound is given away free with Motif 1.1 and supports the
- VAXstation, Sun SPARCstation, DECvoice, and XMedia audio devices. Sun
- sound files work without change.
-
- The SGI Personal IRIS 4D/30 and 4D/35 have the same capabilities as
- the Indigo.
-
- The new Apple Macs have more powerful audio hardware; the latest
- models have built-in microphones.
-
- Software exists for the PC that can play sound on its 1-bit speaker
- using pulse width modulation (see appendix); the Soundblaster board
- records at rates up to 13 k and plays back up to 22 k (weird
- combination, but that's the way it is).
-
- Here's some info about the newest Atari machine, the Falcon030. This
- machine has stereo 16 bit CODECs and a 32 MHz Motorola 56001 that can
- handle 8 channels of 16 bit audio, up to 50 khz/channel with
- simultaneous playback and record. The Falcon DMA sound engine is also
- compatible with the 8 bit stereo DMA used on the STe and TT. All of
- these systems use signed data.
-
- On the NeXT, the Motorola 56001 DSP chip is programmable and you can
- (in principle) do what you want. The SGI uses the same DSP chip but
- it can't be programmed by users -- SGI prefers to offer it as a shared
- system resource to multiple applications, thus enabling developers to
- program audio with their Audio Library and avoid code modifications
- for execution on future machines with different audio hardware, i.e. a
- different DSP.
-
- The Amiga also has a 6-bit volume, which can be used to produce
- something like a 14-bit output for each voice. The hardware can also
- use one of each voice-pair to modulate the other in FM (period) or AM
- (volume, 6-bits).
-
- The Acorn Archimedes uses a variation on U-LAW with the bit order
- reversed and the sign bit in bit 0. Being a 'minority' architecture,
- Arc owners are quite adept at converting sound/image formats from
- other machines, and it is unlikely that you'll ever encounter sound in
- one of the Arc's own formats (there are several).
-
- CD-I machines form a special category. The following formats are used:
-
- - PCM 44.1 kHz standard CD format
- - ADPCM - Adaptive Delta PCM
-     - Level A 37.8 kHz 8-bit
-     - Level B 37.8 kHz 4-bit
-     - Level C 18.9 kHz 4-bit
-
-
- File formats
- ------------
-
- Historically, almost every type of machine used its own file format
- for audio data, but some file formats are more generally applicable,
- and in general it is possible to define conversions between almost any
- pair of file formats -- sometimes losing information, however.
-
- File formats are a separate issue from device characteristics. There
- are two types of file formats: self-describing formats, where the
- device parameters and encoding are made explicit in some form of
- header, and "raw" formats, where the device parameters and encoding
- are fixed.
-
- Self-describing file formats generally define a family of data
- encodings, where a header field indicates the particular encoding
- variant used. Headerless formats define a single encoding and usually
- allow no variation in device parameters (except sometimes sampling
- rate, which can be a pain to figure out other than by listening to the
- sample).
-
- The header of self-describing formats contains the parameters of the
- sampling device and sometimes other information (e.g. a
- human-readable description of the sound, or a copyright notice). Most
- headers begin with a simple "magic word". (Some formats do not simply
- define a header format, but may contain chunks of data intermingled
- with chunks of encoding info.) The data encoding defines how the
- actual samples are stored in the file, e.g. signed or unsigned, as
- bytes or short integers, in little-endian or big-endian byte order,
- etc. Strictly speaking, channel interleaving is also part of the
- encoding, although so far I have seen little variation in this area.
-
- Some file formats apply some kind of compression to the data, e.g.
- Huffman encoding, or simple silence deletion.
-
- Here's an overview of popular file formats.
-
- Self-describing file formats
- ----------------------------
-
- extension, name   origin        variable parameters (fixed; comments)
-
- .au or .snd       NeXT, Sun     rate, #channels, encoding, info string
- .aif(f), AIFF     Apple, SGI    rate, #channels, sample width, lots of info
- .aif(f), AIFC     Apple, SGI    same (extension of AIFF with compression)
- .iff, IFF/8SVX    Amiga         rate, #channels, instrument info (8 bits)
- .voc              Soundblaster  rate (8 bits/1 ch; can use silence deletion)
- .wav, WAVE        Microsoft     rate, #channels, sample width, lots of info
- .sf               IRCAM         rate, #channels, encoding, info
- none, HCOM        Mac           rate (8 bits/1 ch; uses Huffman compression)
- none, MIME        Internet      (see below)
- .mod or .nst      Amiga         (see below)
-
- Note that the filename extension ".snd" is ambiguous: it can be either
- the self-describing NeXT format or the headerless Mac/PC format, or
- even a headerless Amiga format.
-
- I know nothing for sure about the origin of HCOM files, only that
- there are a lot of them floating around on our system and probably at
- FTP sites over the world. The filenames usually don't have a ".hcom"
- extension, but this is what SOX (see below) uses. The file format
- recognized by SOX includes a MacBinary header, where the file
- type field is "FSSD". The data fork begins with the magic word "HCOM"
- and contains Huffman compressed data; after decompression it is
- 8-bit unsigned data.
-
- IFF/8SVX allows for amplitude contours for sounds (attack/decay/etc).
- Compression is optional (and extensible); volume is variable; author,
- notes and copyright properties; etc.
-
- AIFF, AIFC and WAVE are similar in spirit but allow more freedom in
- encoding style (other than 8 bit/sample), amongst others.
-
- There are other sound formats in use on Amiga by digitizers and music
- programs, such as IFF/SMUS.
-
- Appendices describe the NeXT and VOC formats; pointers to more info
- about AIFF, AIFC, 8SVX and WAVE (which are too complex to describe
- here) are also in appendices.
-
- DEC systems (e.g. DECstation 5000) use a variant of the NeXT format
- that uses little-endian encoding and has a different magic number
- (0x0064732E in little-endian encoding).
-
- Standard file formats used in the CD-I world are IFF but on the disc
- they're in realtime files.
-
- An interesting "interchange format" for audio data is described in the
- proposed Internet Standard "MIME", which describes a family of
- transport encodings and structuring devices for electronic mail. This
- is an extensible format, and initially standardizes a type of audio
- data dubbed "audio/basic", which is 8-bit U-LAW data sampled at 8000
- samples/sec.
-
- Finally, "MOD" files are a format that doesn't really belong here.
- They usually have the extension ".mod" or ".nst" (on PCs, that is --
- on Amigas they have a *prefix* of "mod."). These files are short
- clips of sounds with sequencing information. This makes for fairly
- compact files but is limited to making music with samples of
- instruments such as a piano or trumpet.
-
- Headerless file formats
- -----------------------
-
- extension     origin         parameters
- or name
-
- .snd, .fssd   Mac, PC        variable rate, 1 channel, 8 bits unsigned
- .ul           US telephony   8 k, 1 channel, 8 bit "U-LAW" encoding
- .snd?         Amiga          variable rate, 1 channel, 8 bits signed
-
- It is usually easy to distinguish 8-bit signed formats from unsigned
- by looking at the beginning of the data with 'od -b <file | head';
- since most sounds start with a little bit of silence containing small
- amounts of background noise, the signed formats will have an abundance
- of bytes with values 0376, 0377, 0, 1, 2, while the unsigned formats
- will have 0176, 0177, 0200, 0201, 0202 instead. (Using "od -c" will
- also show any headers that are tacked in front of the file.)
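-
- The same eyeball test can be automated. This is only a heuristic
- sketch using the octal values quoted above; the function name and the
- 512-byte window are arbitrary choices for the example, and a file
- that starts loudly or carries a header will fool it.
-
```python
def guess_signedness(data: bytes, window: int = 512) -> str:
    """Guess whether headerless 8-bit samples are signed or unsigned."""
    head = data[:window]
    # Signed silence hovers around 0: bytes 0376, 0377, 0, 1, 2.
    near_zero = sum(1 for b in head if b <= 2 or b >= 0o376)
    # Unsigned silence hovers around 0200: bytes 0176 .. 0202.
    near_mid = sum(1 for b in head if 0o176 <= b <= 0o202)
    if near_zero > near_mid:
        return "signed"
    if near_mid > near_zero:
        return "unsigned"
    return "unknown"


print(guess_signedness(bytes([0, 1, 255, 254, 2] * 20)))        # signed
print(guess_signedness(bytes([128, 127, 129, 126, 130] * 20)))  # unsigned
```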
-
- The Apple IIgs records raw data in the same format as the Mac, but
- uses a 0 byte as a terminator; samples with value 0 are replaced by 1.
-
-
- File conversions
- ----------------
-
- SOX
- ---
-
- The most versatile tool for converting between various audio formats
- is SOX ("Sound Exchange"). It can read and write various types of
- audio files, and optionally applies some special effects (e.g. echo,
- channel averaging, or rate conversion).
-
- SOX recognizes all filename extensions listed above except ".snd",
- which would be ambiguous anyway, and ".wav" (but there's a patch, see
- below). Use type ".au" for NeXT ".snd" files. Mac and PC ".snd"
- files are completely described by these parameters:
-
- -t raw -b -u -r 11000
-
- (or -r 22000 or -r 7333 or -r 5500; 11000 seems to be the most common
- rate).
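-
- To show that these few parameters really are all a player needs,
- here is a sketch that wraps raw unsigned 8-bit mono data in a WAVE
- header using the "wave" module from Python's standard library. It
- does not use SOX, and the function name raw_to_wav is made up for
- this example; the default rate of 11000 is the common Mac rate
- mentioned above.
-
```python
import io
import wave


def raw_to_wav(raw: bytes, rate: int = 11000, channels: int = 1) -> bytes:
    """Return a WAVE file image holding the given unsigned 8-bit samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(1)        # 1 byte per sample; 8-bit WAVE data is unsigned
        w.setframerate(rate)
        w.writeframes(raw)
    return buf.getvalue()


wav_image = raw_to_wav(bytes([128] * 100))   # 100 samples of silence
print(len(wav_image), wav_image[:4], wav_image[8:12])
```
-
- If the guessed rate is wrong the sound simply plays too fast or too
- slow, so trying 22000, 7333 or 5500 is a matter of changing one
- argument.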
-
- The source for SOX, version 5, was posted to alt.sources, and should
- be widely archived. To save you the trouble of hunting it down, it
- can be gotten by anonymous ftp from wuarchive.wustl.edu, in the
- directory usenet/alt.sources/articles, files 5581.Z through 5585.Z.
- (These files are compressed news articles containing shar files, if
- you hadn't guessed.) I am sure many sites have similar archives, I'm
- just listing one that I know of and which carries a lot of this kind
- of stuff. (Also see the appendix if you don't have Internet access.)
-
- A compressed tar file containing the same version of SOX is available
- by anonymous ftp from ftp.cwi.nl [192.16.184.180], in /pub/sox*.tar.Z.
- You may be able to locate a nearer version using archie!
-
- Ports of SOX:
-
- - The source as posted should compile on any UNIX system with 4-byte
- integers.
-
- - A PC version is available by ftp from ftp.cwi.nl (see above) as
- pub/sox4*.zip; also available from the garbo mail server.
-
- - The latest Amiga SOX (corresponding to version 5) is available via
- anonymous ftp to wuarchive.wustl.edu, files
- systems/amiga/audio/utils/amisox*. (See below for a non-SOX
- solution.)
-
- - Work is currently in progress to get SOX ported to VMS (watch
- comp.os.vms for announcements).
-
- SOX usage hints:
-
- - Often, the filename extension of sound files posted on the net is
- wrong. Don't give up, try a few other possibilities using the
- "-t <type>" option. Remember that the most common file type is
- unsigned bytes, which can be indicated with "-t ub". You'll have to
- guess the proper sampling rate, but often it's 11k or 22k.
-
- - In particular, with SOX version 4 (or earlier), you have to
- specify "-t 8svx" for files with an .iff extension.
-
- - When converting linear samples to U-LAW using the .au type for the
- output file, you must specify "-U" for the output file, otherwise
- you will end up with a file containing a NeXT/Sun header but linear
- samples -- only the NeXT will play such files correctly. Also, you
- must explicitly specify an output sampling rate with "-r 8000".
- (This may seem fixed for most cases in version 5, but it is still
- occasionally necessary, so I'm keeping this warning in.)
-
- Sun Sparc
- ---------
-
- On Sun Sparcs, starting at SunOS 4.1, a program "raw2audio" is
- provided by Sun (in /usr/demo/SOUND -- see below) which takes a raw
- U-LAW file and turns it into a ".au" file by prefixing it with an
- appropriate header.
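-
- What "prefixing an appropriate header" amounts to can be sketched as
- follows. This is not Sun's raw2audio; it is an illustration based on
- the generally documented Sun/NeXT header layout (six big-endian
- 32-bit words: magic, data offset, data size, encoding, rate,
- channels, followed by an info string). The function name, the
- constant names and the choice to always write NUL-padded info bytes
- are assumptions of this example.
-
```python
import struct

AU_MAGIC = 0x2E736E64      # the four bytes ".snd"
AU_ENCODING_ULAW_8 = 1     # 8-bit U-LAW in the Sun/NeXT encoding list


def add_au_header(ulaw: bytes, rate: int = 8000, channels: int = 1,
                  info: bytes = b"") -> bytes:
    """Prefix raw U-LAW data with a Sun/NeXT style header."""
    # Pad the info string with 1..4 NUL bytes so that it is terminated
    # and the sample data starts on a 4-byte boundary.
    info_field = info + b"\0" * (4 - len(info) % 4)
    header = struct.pack(">6I", AU_MAGIC, 24 + len(info_field), len(ulaw),
                         AU_ENCODING_ULAW_8, rate, channels)
    return header + info_field + ulaw


au_image = add_au_header(b"\xff" * 10)   # ten U-LAW samples of silence
print(len(au_image), au_image[:4])
```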
-
- NeXT
- ----
-
- On NeXTs, you can usually rename .au files to .snd and it'll work like
- a charm, but some .au files lack header info that the NeXT needs.
- This can be fixed by using sndconvert:
-
- sndconvert -c 1 -f 1 -s 8012.8210513 -o nextfile.snd sunfile.au
-
- SGI Indigo and Personal IRIS
- ----------------------------
-
- SGI supports a program sfconvert, similar in spirit to SOX (in
- /usr/sbin in IRIX version 4.0). Also note that the sfplay program
- (see the next section) can do on-the-fly conversion for several
- popular formats.
-
- Amiga
- -----
-
- Mike Cramer's SoundZAP can do no effects except rate change and it
- only does conversions to IFF, but it is generally much faster than
- SOX. (Ftp'able from the same directory as amisox above.)
-
- Tandy
- -----
-
- The Tandy 1000 uses a (proprietary?) compressed format. There is a
- PD Mac to Tandy conversion program called CONVERT.
-
-
- Playing audio files on UNIX
- ---------------------------
-
- The commands needed to play an audio file depend on the file format
- and the available hardware and software. Most systems can only
- directly play sound in their native format; use a conversion program
- (see above) to play other formats.
-
- Sun Sparcstation running SunOS 4.x
- ----------------------------------
-
- Raw U-LAW files can be played using "cat file >/dev/audio".
-
- A whole package for dealing with ".au" files is provided by Sun on an
- experimental basis, in /usr/demo/SOUND. You may have to compile the
- programs first. (If you can't find this directory, either you are not
- running SunOS 4.1 yet, or your system administrator hasn't installed
- it -- go ask him for it, not me!) The program "play" in this
- directory recognizes all files in Sun/NeXT format, but a SS 1 or 2 can
- play only those using U-LAW encoding at 8 k -- the SS 10 hardware
- plays other encodings, too.
-
- If you can't find "play", you can also cat a ".au" file to /dev/audio,
- if it uses U-LAW; the header will sound like a short burst of noise
- but the rest of the data will sound OK (really, the only difference in
- this case between raw U-LAW and ".au" files is the header; the U-LAW
- data is exactly the same).
-
- Finally, OpenWindows 3.0 has a full-fledged audio tool. You can drop
- audio file icons into it, edit them, etc.
-
- Sun Sparcstation running Solaris 2.0
- ------------------------------------
-
- Under SVR4 (and hence Solaris 2.0), writing to /dev/audio from the
- shell is a bad idea, because the device driver will flush its queue as
- soon as the file is closed. Use "audioplay" instead. The supported
- formats and sampling rates are the same as above.
-
- NeXT
- ----
-
- On NeXT machines, the standard "sndplay" program can play all NeXT
- format files (this includes Sun ".au" files). It supports at least
- U-LAW at 8 k and 16 bits samples at 22 or 44.1 k. It attempts
- on-the-fly conversions for other formats.
-
- Sound files are also played if you double-click on them in the file
- browser.
-
- SGI Indigo and Personal IRIS
- ----------------------------
-
- On SGI Indigo and the 4D/30 and /35 Personal IRIS workstations, the
- program "sfplay" (in /usr/sbin) plays AIFF files, if the sampling rate
- is one of 8000, 11025, 16000, 22050, 32000, 44100, or 48000. On the
- Personal IRIS, you need to have the audio board installed (check the
- output from hinv) and you must run IRIX 3.3.2 or 4.0 or higher.
- "Workspace" plays audio files if you double click on them.
-
- There is no simple /dev/audio interface on these SGI machines. (There
- was one on 4D/25 machines, reading and writing signed linear 8-bit
- samples at rates of 8, 16 and 32 k.)
-
- A program "playulaw" was posted as part of the "radio 2.0" release
- that I posted to several source groups recently; it plays raw U-LAW
- files on the Indigo or Personal IRIS audio hardware.
-
- Sony NEWS
- ---------
-
- The Sony RISC-NEWS line (NWS-3250 laptop, NWS-37xx desktop, NWS-38xx
- desktop w/ IOP) also has builtin sound capabilities. You can also buy
- external boards for the older NEWS machines or to add extra channels
- to the new machines. In the default mode (8k/8-bit), Sun .au files
- are directly supported (you can 'cat' .au files to /dev/sb and have
- them play).
-
- Vaxstation 4000
- ---------------
-
- ".au" files can be played by COPYING them to device "SOA0:". This
- device is set up by enabling the driver SODRIVER, as described below:
-
- DEC's sound stuff is like most other new toys. Hardware first, THEN the
- software. DEC will soon be releasing a layered product called DECsound,
- which will let you record, play, and (possibly) manipulate sound files.
- Third party product(s) have ALREADY hit the market.
-
- Enabling SODRIVER: (you can use the following command file)
-
- $!---------------- cut here -------------------------------
- $! sound_setup.com enable SOUND driver
- $ run sys$system:sysgen
- connect soa0 /adapter=0 /csr=%x0e00 /vector=%o304 /driver=sodriver
- exit
- $ exit
- $!----------------- cut here ------------------------------------
-
- The external audio port comes with a telephone-jack-like port. For
- starters, you can plug a telephone RECEIVER right into this port to
- hear your first sound files. After that, you can use the adapter
- (that came with the VaxStation), and plug in a small set of stereo
- speakers (the kind you'd plug into a WALKMAN, for example), for more
- volume.
-
- Others
- ------
-
- Most other UNIX boxes don't have audio hardware and thus can't play
- audio data.
-
-
- Playing audio files on micros
- -----------------------------
-
- Most micros have at least a speaker built in, so theoretically all you
- need is the right software. Unfortunately most systems don't come
- bundled with sound-playing software, so there are many public domain
- or shareware software packages, each with their own bugs and features.
- Most separate sound recording hardware also comes with playing
- software, most of which can play sound (in the file format used by
- that hardware) even on machines that don't have that hardware
- installed.
-
- Chris S. Craig announces the following software for PCs:
-
- ScopeTrax This is a complete PC sound player/editor package. Sounds
- can be played back at ANY rate between 1kHz to 65kHz through
- the PC speaker or the Sound Blaster. It supports several
- file formats including VOC, IFF/8SVX, raw signed and raw
- unsigned. A separate executable is provided to convert
- .au and mu-law to raw format. ScopeTrax requires EGA/VGA
- graphics for editing and displaying sounds on a REALTIME
- oscilloscope. The package also includes:
- * An expanded memory player which can play sounds
- larger than 640K in size.
- * Basic (rough) sound compression/uncompression
- utilities.
- * Complete documentation.
- The package is FREEWARE! It is available on SIMTEL in the
- PD1:[MSDOS.SOUND] directory.
-
- One of the appendices below contains a list of more programs to play
- sound on the PC.
-
- For sounds on Atari STs - programs are in the atari/sound/players
- directory on atari.archive.umich.edu (141.211.164.8).
-
- Malcolm Slaney from Apple writes:
-
- "We do have tools to play sound back on most of our Unix hosts. We wrote
- a program called TcpPlay that lets us read a sound file on a Unix host,
- open a TCP/IP connection to the Mac on my desk, and plays the file. We
- think of it as X windows for sound (at least a step in that direction.)
-
- This software is available for anonymous FTP from ftp.apple.com.
- Look for ~ftp/pub/TcpPlay/TcpPlay.sit.hqx.
-
- Finally, there are MANY tools for working with sound on the Macintosh. Three
- applications that come to mind immediately are SoundEdit (formerly by
- Farallon and now by MacroMind/Paracomp), Alchemy and Eric Keller's Signalyze.
- There are lots of other tools available for sound editing (including some
- of the QuickTime Movie tools.)"
-
- On a Tandy 1000, sounds can be played and recorded with DeskMate Sound
- (SOUND.PDM), or if they are not stored in compressed format, they can also
- be played by a program called PLAYSND. No indication of whether
- PLAYSND is PD or not. It hasn't been updated since March of 89.
-
- The Sound Site Newsletter
- -------------------------
-
- An electronic publication with lots of info about digitised sound and
- sound formats, albeit mostly on micros, is "The Sound Site
- Newsletter". So far, 8 issues have appeared, the last in January
- 1992. Issues can be ftp'ed from saffron.inset.com, directory
- pub/rogue/newsletters, or from ccb.ucsf.edu, directory
- Pub/Sound_list/Sound.Newsletters.
-
-
- Posting sounds
- --------------
-
- The newsgroup alt.binaries.sounds.misc is dedicated to postings
- containing sound. (Discussions related to such postings belong in
- alt.binaries.sounds.d.)
-
- There is no set standard for posting sounds; uuencoded files in most
- popular formats are welcome, if split in parts under 50 kBytes. To
- accommodate automatic decoding software (such as the ":decode" command
- of the nn newsreader), please place a part indicator of the form
- (mm/nn) at the end of your subject, meaning this is part number mm of a
- total of nn parts.
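-
- As a rough illustration of what decoding software has to do with such
- a subject line, here is a minimal C sketch that pulls mm and nn out of
- a trailing "(mm/nn)". The function name parse_part_indicator is made up
- for this example; it is not taken from nn or any other newsreader.
-

```c
#include <stdio.h>
#include <string.h>

/* Extract mm and nn from a trailing "(mm/nn)" part indicator.
 * Returns 1 on success, 0 if the subject has no valid indicator. */
int parse_part_indicator(const char *subject, int *part, int *total)
{
    const char *open = strrchr(subject, '(');
    int mm, nn, used = 0;

    if (open == NULL)
        return 0;
    /* %n records how many characters were consumed; it stays 0 when the
     * closing parenthesis is missing, so that case is rejected too. */
    if (sscanf(open, "(%d/%d)%n", &mm, &nn, &used) != 2 || used == 0)
        return 0;
    /* The indicator must be the last thing in the subject. */
    if (open[used] != '\0' || mm < 1 || nn < 1 || mm > nn)
        return 0;
    *part = mm;
    *total = nn;
    return 1;
}
```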
-
- It is recommended to post sounds in the format that was used for the
- original recording; conversions to other formats often lose
- information and do no favor to people with the same hardware as the
- poster. For instance, converting 8-bit linear sound to U-LAW loses
- the lower few bits of the data, and rate changing conversions almost
- always add noise. Converting from U-LAW to linear requires expansion
- to 16 bit samples if no information loss is allowed!
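-
- To see why 16 bits are needed, here is a sketch of the usual table-free
- expansion of one U-LAW byte. It assumes the common telephony
- (G.711-style) encoding in which the byte is stored bit-inverted, as in
- Sun/NeXT files; the name ulaw_to_linear is an assumption of this
- example. The largest magnitude it produces is 32124, far outside the
- range of an 8-bit linear sample.
-

```c
#include <stdint.h>

/* Expand one inverted mu-law byte to a signed 16-bit linear sample. */
int16_t ulaw_to_linear(uint8_t u)
{
    int sign, exponent, mantissa, sample;

    u = (uint8_t)~u;              /* undo the bit inversion */
    sign     = u & 0x80;          /* top bit: sign */
    exponent = (u >> 4) & 0x07;   /* next 3 bits: segment number */
    mantissa = u & 0x0F;          /* low 4 bits: step within segment */

    /* 0x84 is the bias that is added before encoding; remove it again. */
    sample = ((mantissa << 3) + 0x84) << exponent;
    sample -= 0x84;

    return (int16_t)(sign ? -sample : sample);
}
```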
-
- U-LAW data is best posted with a NeXT/Sun header.
-
- If you have to post a file in a headerless format (usually 8-bit
- linear, like ".snd"), please add a description giving at least the
- sampling rate and whether the bytes are signed (zero at 0) or unsigned
- (zero at 0200). However, it is highly recommended to add a header
- that indicates the sampling rate and encoding scheme; if necessary you
- can use SOX to add a header of your choice to raw data.
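-
- The two 8-bit conventions differ only in where silence sits: 0 for
- signed bytes, 0200 octal (0x80) for unsigned bytes. A minimal sketch of
- converting between them follows; flip_sign_convention is a made-up
- name, and the same call converts in either direction.
-

```c
#include <stddef.h>

/* Convert 8-bit samples in place between unsigned (silence at 0x80)
 * and signed (silence at 0) by toggling the top bit of every byte. */
void flip_sign_convention(unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        buf[i] ^= 0x80;
}
```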
-
- Compression of sound files usually isn't worth it; the standard
- "compress" algorithm doesn't save much when applied to sound data
- (typically at most 10-20 percent), and compression algorithms
- specifically designed for sound (e.g. NeXT's) are usually
- proprietary. (See also the section "Compression schemes" earlier.)
-
-
- Appendices
- ==========
-
- Here are some more detailed pieces of info that I received by e-mail.
- They are reproduced here with hardly any editing.
-
- ------------------------------------------------------------------------
- FTP access for non-internet sites
- ---------------------------------
-
- From the sci.space FAQ:
-
- Sites not connected to the Internet cannot use FTP directly, but
- there are a few automated FTP servers which operate via email.
- Send mail containing only the word HELP to ftpmail@decwrl.dec.com
- or bitftp@pucc.princeton.edu, and the servers will send you
- instructions on how to make requests.
-
- Also:
-
- FAQ lists are available by anonymous FTP from pit-manager.mit.edu
- (18.72.1.58) and by email from mail-server@pit-manager.mit.edu (send
- a message containing "help" for instructions about the mail server).
-
-
- ------------------------------------------------------------------------
- AIFF Format (Audio IFF) and AIFC
- --------------------------------
-
- This format was developed by Apple for storing high-quality sampled
- sound and musical instrument info; it is also used by SGI and several
- professional audio packages (sorry, I know no names). An extension,
- called AIFC or AIFF-C, supports compression (see the last item below).
-
- I've made a BinHex'ed MacWrite version of the AIFF spec (no idea if
- it's the same text as mentioned below) available by anonymous ftp from
- ftp.cwi.nl [192.16.184.180]; the file is /pub/AudioIFF1.2.hqx. But
- you may be better off with the AIFF-C specs, see below.
-
- Mike Brindley (brindley@ece.orst.edu) writes:
-
- "The complete AIFF spec by Steve Milne, Matt Deatherage (Apple) is
- available in 'AMIGA ROM Kernel Reference Manual: Devices (3rd Edition)'
- 1991 by Commodore-Amiga, Inc.; Addison-Wesley Publishing Co.;
- ISBN 0-201-56775-X, starting on page 435 (this edition has a charcoal
- grey cover). It is available in most bookstores, and soon in many
- good libraries."
-
- Finally, Mark Callow writes (in comp.sys.sgi):
-
- "I have placed a PostScript version of the AIFF-C specification on
- sgi.sgi.com for public ftp. It is in the file sgi/aiff-c.9.26.91.ps.
-
- sgi.sgi.com's internet host number is (I think) 192.48.153.1."
-
- ------------------------------------------------------------------------
- The NeXT/Sun audio file format
- ------------------------------
-
- Here's the complete story on the file format, from the NeXT
- documentation. (Note that the "magic" number is ((int)0x2e736e64),
- which equals ".snd".) Also, at the end, I've added a little document
- that someone posted to the net a couple of years ago, that describes
- the format in a bit-by-bit fashion rather than from C.
-
- I received this from Doug Keislar, NeXT Computer. This is also the
- Sun format, except that Sun doesn't recognize as many format codes. I
- added the numeric codes to the table of formats and sorted it.
-
-
- SNDSoundStruct: How a NeXT Computer Represents Sound
-
- The NeXT sound software defines the SNDSoundStruct structure to
- represent sound. This structure defines the soundfile and Mach-O
- sound segment formats and the sound pasteboard type. It's also used
- to describe sounds in Interface Builder. In addition, each instance
- of the Sound Kit's Sound class encapsulates a SNDSoundStruct and
- provides methods to access and modify its attributes.
-
- Basic sound operations, such as playing, recording, and cut-and-paste
- editing, are most easily performed by a Sound object. In many cases,
- the Sound Kit obviates the need for in-depth understanding of the
- SNDSoundStruct architecture. For example, if you simply want to
- incorporate sound effects into an application, or to provide a simple
- graphic sound editor (such as the one in the Mail application), you
- needn't be aware of the details of the SNDSoundStruct. However, if
- you want to closely examine or manipulate sound data you should be
- familiar with this structure.
-
- The SNDSoundStruct contains a header, information that describes the
- attributes of a sound, followed by the data (usually samples) that
- represents the sound. The structure is defined (in
- sound/soundstruct.h) as:
-
- typedef struct {
-     int magic;          /* magic number SND_MAGIC */
-     int dataLocation;   /* offset or pointer to the data */
-     int dataSize;       /* number of bytes of data */
-     int dataFormat;     /* the data format code */
-     int samplingRate;   /* the sampling rate */
-     int channelCount;   /* the number of channels */
-     char info[4];       /* optional text information */
- } SNDSoundStruct;
-
-
-
-
- SNDSoundStruct Fields
-
-
-
- magic
-
- magic is a magic number that's used to identify the structure as a
- SNDSoundStruct. Keep in mind that the structure also defines the
- soundfile and Mach-O sound segment formats, so the magic number is
- also used to identify these entities as containing a sound.
-
-
-
-
-
- dataLocation
-
- It was mentioned above that the SNDSoundStruct contains a header
- followed by sound data. In reality, the structure only contains the
- header; the data itself is external to, although usually contiguous
- with, the structure. (Nonetheless, it's often useful to speak of the
- SNDSoundStruct as the header and the data.) dataLocation is used to
- point to the data. Usually, this value is an offset (in bytes) from
- the beginning of the SNDSoundStruct to the first byte of sound data.
- The data, in this case, immediately follows the structure, so
- dataLocation can also be thought of as the size of the structure's
- header. The other use of dataLocation, as an address that locates
- data that isn't contiguous with the structure, is described in
- "Format Codes," below.
-
-
-
-
-
- dataSize, dataFormat, samplingRate, and channelCount
-
- These fields describe the sound data.
-
- dataSize is its size in bytes (not including the size of the
- SNDSoundStruct).
-
- dataFormat is a code that identifies the type of sound. For sampled
- sounds, this is the quantization format. However, the data can also
- be instructions for synthesizing a sound on the DSP. The codes are
- listed and explained in "Format Codes," below.
-
- samplingRate is the sampling rate (if the data is samples). Three
- sampling rates, represented as integer constants, are supported by
- the hardware:
-
- Constant          Sampling Rate (samples/sec)
-
- SND_RATE_CODEC    8012.821   (CODEC input)
- SND_RATE_LOW      22050.0    (low sampling rate output)
- SND_RATE_HIGH     44100.0    (high sampling rate output)
-
- channelCount is the number of channels of sampled sound.
-
-
-
-
-
- info
-
- info is a NULL-terminated string that you can supply to provide a
- textual description of the sound. The size of the info field is set
- when the structure is created and thereafter can't be enlarged. It's
- at least four bytes long (even if it's unused).
-
-
-
-
-
- Format Codes
-
- A sound's format is represented as a positive 32-bit integer. NeXT
- reserves the integers 0 through 255; you can define your own format
- and represent it with an integer greater than 255. Most of the
- formats defined by NeXT describe the amplitude quantization of
- sampled sound data:
-
- Value  Code                              Format
-
-     0  SND_FORMAT_UNSPECIFIED            unspecified format
-     1  SND_FORMAT_MULAW_8                8-bit mu-law samples
-     2  SND_FORMAT_LINEAR_8               8-bit linear samples
-     3  SND_FORMAT_LINEAR_16              16-bit linear samples
-     4  SND_FORMAT_LINEAR_24              24-bit linear samples
-     5  SND_FORMAT_LINEAR_32              32-bit linear samples
-     6  SND_FORMAT_FLOAT                  floating-point samples
-     7  SND_FORMAT_DOUBLE                 double-precision float samples
-     8  SND_FORMAT_INDIRECT               fragmented sampled data
-     9  SND_FORMAT_NESTED                 ?
-    10  SND_FORMAT_DSP_CORE               DSP program
-    11  SND_FORMAT_DSP_DATA_8             8-bit fixed-point samples
-    12  SND_FORMAT_DSP_DATA_16            16-bit fixed-point samples
-    13  SND_FORMAT_DSP_DATA_24            24-bit fixed-point samples
-    14  SND_FORMAT_DSP_DATA_32            32-bit fixed-point samples
-    15  ?
-    16  SND_FORMAT_DISPLAY                non-audio display data
-    17  SND_FORMAT_MULAW_SQUELCH          ?
-    18  SND_FORMAT_EMPHASIZED             16-bit linear with emphasis
-    19  SND_FORMAT_COMPRESSED             16-bit linear with compression
-    20  SND_FORMAT_COMPRESSED_EMPHASIZED  A combination of the two above
-    21  SND_FORMAT_DSP_COMMANDS           Music Kit DSP commands
-    22  SND_FORMAT_DSP_COMMANDS_SAMPLES   ?
-
- Most formats identify different sizes and types of
- sampled data. Some deserve special note:
-
-
- -- SND_FORMAT_DSP_CORE format contains data that represents a
- loadable DSP core program. Sounds in this format are required by the
- SNDBootDSP() and SNDRunDSP() functions. You create a
- SND_FORMAT_DSP_CORE sound by reading a DSP load file (extension
- ".lod") with the SNDReadDSPfile() function.
-
- -- SND_FORMAT_DSP_COMMANDS is used to distinguish sounds that
- contain DSP commands created by the Music Kit. Sounds in this format
- can only be created through the Music Kit's Orchestra class, but can
- be played back through the SNDStartPlaying() function.
-
- -- SND_FORMAT_DISPLAY format is used by the Sound Kit's
- SoundView class. Such sounds can't be played.
-
-
- -- SND_FORMAT_INDIRECT indicates data that has become
- fragmented, as described in a separate section, below.
-
-
- -- SND_FORMAT_UNSPECIFIED is used for unrecognized formats.
-
-
-
-
-
- Fragmented Sound Data
-
- Sound data is usually stored in a contiguous block of memory.
- However, when sampled sound data is edited (such that a portion of
- the sound is deleted or a portion inserted), the data may become
- discontiguous, or fragmented. Each fragment of data is given its own
- SNDSoundStruct header; thus, each fragment becomes a separate
- SNDSoundStruct structure. The addresses of these new structures are
- collected into a contiguous, NULL-terminated block; the dataLocation
- field of the original SNDSoundStruct is set to the address of this
- block, while the original format, sampling rate, and channel count
- are copied into the new SNDSoundStructs.
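-
- The bookkeeping described above can be pictured with a small sketch.
- It assumes the NULL-terminated block of addresses is handed over as an
- ordinary C array of pointers: on the 32-bit NeXT the int dataLocation
- field can hold that address, but on a 64-bit machine an int cannot, so
- the list is passed separately here. snd_fragment_bytes is a made-up
- helper, not part of the NeXT sound library.
-

```c
#include <stddef.h>

typedef struct {
    int magic;
    int dataLocation;
    int dataSize;
    int dataFormat;
    int samplingRate;
    int channelCount;
    char info[4];
} SNDSoundStruct;

/* Add up the dataSize of every fragment in a NULL-terminated list of
 * SNDSoundStruct pointers. */
long snd_fragment_bytes(SNDSoundStruct *const *fragments)
{
    long total = 0;

    while (*fragments != NULL) {
        total += (*fragments)->dataSize;
        fragments++;
    }
    return total;
}
```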
-
-
- Fragmentation serves one purpose: It avoids the high cost of moving
- data when the sound is edited. Playback of a fragmented sound is
- transparent; you never need to know whether the sound is fragmented
- before playing it. However, playback of a heavily fragmented sound
- is less efficient than that of a contiguous sound. The
- SNDCompactSamples() C function can be used to compact fragmented
- sound data.
-
- Sampled sound data is naturally unfragmented. A sound that's freshly
- recorded or retrieved from a soundfile, the Mach-O segment, or the
- pasteboard won't be fragmented. Keep in mind that only sampled data
- can become fragmented.
-
-
-
- _________________________
- >From mentor.cc.purdue.edu!purdue!decwrl!ucbvax!ziploc!eps Wed Apr 4
- 23:56:23 EST 1990
- Article 5779 of comp.sys.next:
- Path: mentor.cc.purdue.edu!purdue!decwrl!ucbvax!ziploc!eps
- >From: eps@toaster.SFSU.EDU (Eric P. Scott)
- Newsgroups: comp.sys.next
- Subject: Re: Format of NeXT sndfile headers?
- Message-ID: <445@toaster.SFSU.EDU>
- Date: 31 Mar 90 21:36:17 GMT
- References: <14978@phoenix.Princeton.EDU>
- Reply-To: eps@cs.SFSU.EDU (Eric P. Scott)
- Organization: San Francisco State University
- Lines: 42
-
- In article <14978@phoenix.Princeton.EDU>
- bskendig@phoenix.Princeton.EDU (Brian Kendig) writes:
- >I'd like to take a program I have that converts Macintosh sound
- >files to NeXT sndfiles and polish it up a bit to go the other direction as
- >well.
-
- Two people have already submitted programs that do this
- (Christopher Lane and Robert Hood); check the various
- NeXT archive sites.
-
- > Could someone please give me the format of a NeXT sndfile
- >header?
-
- "big-endian"
-           0       1       2       3
-       +-------+-------+-------+-------+
-     0 | 0x2e  | 0x73  | 0x6e  | 0x64  |   "magic" number
-       +-------+-------+-------+-------+
-     4 |                               |   data location
-       +-------+-------+-------+-------+
-     8 |                               |   data size
-       +-------+-------+-------+-------+
-    12 |                               |   data format (enum)
-       +-------+-------+-------+-------+
-    16 |                               |   sampling rate (int)
-       +-------+-------+-------+-------+
-    20 |                               |   channel count
-       +-------+-------+-------+-------+
-    24 |       |       |       |       |   (optional) info string
-
- 28 = minimum value for data location
-
- data format values can be found in /usr/include/sound/soundstruct.h
-
- Most common combinations:
-
-               sampling  channel  data
-               rate      count    format
- voice file    8012      1        1 =  8-bit mu-law
- system beep   22050     2        3 = 16-bit linear
- CD-quality    44100     2        3 = 16-bit linear
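-
- The byte layout above translates fairly directly into C. The
- following is a sketch, not NeXT or Sun code: struct snd_header and
- snd_parse_header are names invented here, and the fields are read byte
- by byte so that the result does not depend on the byte order of the
- machine running it. The post above gives 28 as the minimum data
- location; this sketch only rejects values that would overlap the six
- fixed fields.
-

```c
#include <stddef.h>
#include <stdint.h>

#define SND_MAGIC 0x2e736e64UL   /* the bytes ".snd" */

struct snd_header {
    uint32_t dataLocation;
    uint32_t dataSize;
    uint32_t dataFormat;
    uint32_t samplingRate;
    uint32_t channelCount;
};

/* Assemble a 32-bit value from four bytes, high order byte first. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the fixed part of a header from the first bytes of a file.
 * Returns 0 on success, -1 if the buffer is too short, the magic
 * number is wrong, or dataLocation points into the fixed fields. */
int snd_parse_header(const unsigned char *buf, size_t len,
                     struct snd_header *out)
{
    if (len < 24 || be32(buf) != SND_MAGIC)
        return -1;
    out->dataLocation = be32(buf + 4);
    out->dataSize     = be32(buf + 8);
    out->dataFormat   = be32(buf + 12);
    out->samplingRate = be32(buf + 16);
    out->channelCount = be32(buf + 20);
    if (out->dataLocation < 24)
        return -1;
    return 0;
}
```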
-
- ------------------------------------------------------------------------
- IFF/8SVX Format
- ---------------
-
- Newsgroups: alt.binaries.sounds.d,alt.sex.sounds
- Subject: Format of the IFF header (Amiga sounds)
- Message-ID: <2509@tardis.Tymnet.COM>
- From: jms@tardis.Tymnet.COM (Joe Smith)
- Date: 23 Oct 91 23:54:38 GMT
- Followup-To: alt.binaries.sounds.d
- Organization: BT North America (Tymnet)
-
- The first 12 bytes of an IFF file are used to distinguish between an Amiga
- picture (FORM-ILBM), an Amiga sound sample (FORM-8SVX), or other file
- conforming to the IFF specification. The middle 4 bytes hold the count of
- bytes that follow the "FORM" and byte count longwords. (Numbers are stored
- in M68000 form, high order byte first.)
-
- ------------------------------------------
-
- FutureSound audio file, 15000 samples at 10.000KHz, file is 15048 bytes long.
-
- 0000: 464F524D 00003AC0 38535658 56484452 FORM..:.8SVXVHDR
-       F O R M     15040 8 S V X  V H D R
- 0010: 00000014 00003A98 00000000 00000000 ......:.........
-             20    15000        0        0
- 0020: 27100100 00010000 424F4459 00003A98 '.......BODY..:.
-       10000 1 0     1.0 B O D Y     15000
-
- 0000000..03 = "FORM", identifies this as an IFF format file.
- FORM+00..03 (ULONG) = number of bytes that follow. (Unsigned long int.)
- FORM+04..07 = "8SVX", identifies this as an 8-bit sampled voice.
-
- ????+00..03 = "VHDR", Voice8Header, describes the parameters for the BODY.
- VHDR+00..03 (ULONG) = number of bytes to follow.
- VHDR+04..07 (ULONG) = samples in the high octave 1-shot part.
- VHDR+08..0B (ULONG) = samples in the high octave repeat part.
- VHDR+0C..0F (ULONG) = samples per cycle in high octave (if repeating), else 0.
- VHDR+10..11 (UWORD) = samples per second. (Unsigned 16-bit quantity.)
- VHDR+12 (UBYTE) = number of octaves of waveforms in sample.
- VHDR+13 (UBYTE) = data compression (0=none, 1=Fibonacci-delta encoding).
- VHDR+14..17 (FIXED) = volume. (The number 65536 means 1.0 or full volume.)
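-
- The VHDR fields listed above can be decoded as in the following
- sketch. The struct and its field names are invented for this example
- (they are not the names used in the Commodore headers), and p is
- assumed to point at VHDR+04, the first byte after the byte count.
-

```c
#include <stdint.h>

struct vhdr {
    uint32_t one_shot_samples;    /* VHDR+04..07 */
    uint32_t repeat_samples;      /* VHDR+08..0B */
    uint32_t samples_per_cycle;   /* VHDR+0C..0F */
    uint16_t samples_per_sec;     /* VHDR+10..11 */
    uint8_t  octaves;             /* VHDR+12 */
    uint8_t  compression;         /* VHDR+13 */
    uint32_t volume;              /* VHDR+14..17, 65536 = 1.0 */
};

/* M68000 order: high order byte first. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the 20 bytes that follow the VHDR byte count. */
void vhdr_decode(const unsigned char *p, struct vhdr *out)
{
    out->one_shot_samples  = be32(p);
    out->repeat_samples    = be32(p + 4);
    out->samples_per_cycle = be32(p + 8);
    out->samples_per_sec   = (uint16_t)((p[12] << 8) | p[13]);
    out->octaves           = p[14];
    out->compression       = p[15];
    out->volume            = be32(p + 16);
}
```

- Fed the 20 bytes at file offset 0014 of the dump above, this gives
- 15000 one-shot samples, 10000 samples per second, 1 octave, no
- compression and a volume of 65536 (1.0).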
-
- ????+00..03 = "BODY", identifies the start of the audio data.
- BODY+00..03 (ULONG) = number of bytes to follow.
- BODY+04..NNNNN = Data, signed bytes, from -128 to +127.
-
- 0030: 04030201 02030303 04050605 05060605
- 0040: 06080806 07060505 04020202 01FF0000
- 0050: 00000000 FF00FFFF FFFEFDFD FDFEFFFF
- 0060: FDFDFF00 00FFFFFF 00000000 00FFFF00
- 0070: 00000000 00FF0000 00FFFEFF 00000000
- 0080: 00010000 000101FF FF0000FE FEFFFFFE
- 0090: FDFDFEFD FDFFFFFC FDFEFDFD FEFFFEFE
- 00A0: FFFEFEFE FEFEFEFF FFFFFEFF 00FFFF01
-
- This small section of the audio sample shows the numbers ranging from -4 (0xFC)
- to +8 (0x08). Warning: Do not assume that the BODY starts 48 bytes into the
- file. In addition to "VHDR", chunks labeled "NAME", "AUTH", "ANNO", or
- "(c) " may be present, and may be in any order. You will have to check the
- byte count in each chunk to determine how many bytes to skip.
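-
- Skipping chunks by their byte counts might look like the sketch below.
- The function name iff_find_chunk is invented, and the rule that an
- odd-length chunk is followed by one pad byte comes from the EA IFF 85
- specification rather than from the description above.
-

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* M68000 order: high order byte first. */
static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Look for the first chunk named id (4 characters) inside a FORM.
 * Returns the offset of the chunk's data and stores its byte count in
 * *size; returns 0 if the chunk is not found in the buffer. */
size_t iff_find_chunk(const unsigned char *buf, size_t len,
                      const char *id, uint32_t *size)
{
    size_t pos = 12;   /* skip "FORM", its byte count and the form type */

    if (len < 12 || memcmp(buf, "FORM", 4) != 0)
        return 0;
    while (pos + 8 <= len) {
        uint32_t n = be32(buf + pos + 4);

        if (memcmp(buf + pos, id, 4) == 0) {
            *size = n;
            return pos + 8;
        }
        if (n > len - pos - 8)
            return 0;                       /* chunk runs past the buffer */
        pos += 8 + (size_t)n + (n & 1);     /* pad byte after odd lengths */
    }
    return 0;
}
```

- On the first 48 bytes of the dump above, looking for "BODY" returns
- offset 48 with a byte count of 15000, but only because VHDR happens to
- be the only chunk in front of it.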
-
- ------------------------------------------------------------------------
- Playing sound on a PC
- ---------------------
-
- From: Eric A Rasmussen
-
- Any turbo PC (8088 at 8 MHz or greater)/286/386/486/etc. can produce a quality
- playback of single channel 8 bit sounds on the internal (1 bit, 1 channel)
- speaker by utilizing Pulse-Width-Modulation, which toggles the speaker faster
- than it can physically move to simulate positions between fully on and fully
- off. There are several PD programs of this nature that I know of:
-
- REMAC - Plays MAC format sound files. Files on the Macintosh, at least the
- sound files that I've ripped apart, seem to contain 3 parts. The
- first two are info like what the file icon looks like and other
- header type info. The third part contains the raw sample data, and
- it is this portion of the file which is saved to a separate file,
- often named with the .snd extension by PC users. Personally, I like
- to name the files .s1, .s2, .s3, or .s4 to indicate the sampling rate
- of the file. (-s# is how to specify the playback rate in REMAC.)
- REMAC provides playback rates of 5550 Hz, 7333 Hz, 11 kHz, & 22 kHz.
- REMAC2 - Same as REMAC, but sounds better on higher speed machines.
- REPLAY - Basically same as REMAC, but for playback of Atari ST sounds.
- Apparently, the Atari has two sound formats, one of which sounds like
- garbage if played by REMAC or REPLAY in the incorrect mode. The
- other file format works fine with REMAC and so appears to be 'normal'
- unsigned 8-bit data. REPLAY provides playback rates of 11.5 kHz,
- 12.5 kHz, 14 kHz, 16 kHz, 18.5 kHz, 22 kHz, & 27 kHz.
-
- These three programs are all by the same author, Richard E. Zobell who does
- not have an internet mail address to my knowledge, but does have a GEnie email
- address of R.ZOBELL.
-
- Additionally, there are various stand-alone demos which use the internal
- speaker, of which there is one called mushroom which plays a 30 second
- advertising jingle for magic mushroom room deodorizers which is pretty
- humorous. I've used this player to playback samples that I ripped out of the
- commercial game program Mean Streets, which uses something they call RealSound
- (tm) to playback digital samples on the internal speaker. (Of course, I only do
- this on my own system, and since I own the game, I see no problems with it.)
-
- For owners of 8 MHz 286's and above, the option to play 4 channel 8 bit sounds
- (with decent quality) on the internal speaker is also a reality. Quite a
- number of PD programs exist to do this, including, but not limited to:
-
- ModEdit, ModPlay, ScreamTracker, STM, Star Trekker, Tetra, and probably a few
- more.
-
- All these programs basically make use of various sound formats used by the
- Amiga line of computers. These include .stm files, .mod files
- [a.k.a. mod. files], and .nst files [really the same thing]. Also,
- these programs pretty much all have the option to playback the
- sound to add-on hardware such as the SoundBlaster card, the Covox series of
- devices, and also to direct the data to either one or two (for stereo)
- parallel ports, which you could attach your own D/A's to. (From what I have
- seen, the Covox is basically a small amplified speaker with a D/A which plugs
- into the parallel port. This sounds very similar to the Disney Sound System
- (DSS) which people have been talking about recently.)
-
- ------------------------------------------------------------------------
- The EA-IFF-85 documentation
- ---------------------------
-
- From: dgc3@midway.uchicago.edu
-
- As promised, here's an ftp location for the EA-IFF-85 documentation. It's
- the November 1988 release as revised by Commodore (the last public release),
- with specifications for IFF FORMs for graphics, sound, formatted text, and
- more. IFF FORMS now exist for other media, including structured drawing, and
- new documentation is now available only from Commodore.
-
- The documentation is at grind.isca.uiowa.edu [128.255.19.233], in the
- directory /amiga/f1/ff185. The complete file list is as follows:
-
- DOCUMENTS.zoo
- EXAMPLES.zoo
- EXECUTABLE.zoo
- INCLUDE.zoo
- LINKER_INFO.zoo
- OBJECT.zoo
- SOURCE.zoo
- TP_IFF_Specs.zoo
-
- All files except DOCUMENTS.zoo are Amiga-specific, but may be used as a basis
- for conversion to other platforms. Well, I take that tentatively back. I
- don't know what TP_IFF_Specs.zoo contains, so it might be non-Amiga-specific.
-
- ------------------------------------------------------------------------
- US Federal Standard 1016 availability
- -------------------------------------
-
- From: Joe Campbell N3JBC jpcampb@afterlife.ncsc.mil 74040.305@compuserve.com
-
- The U.S. DoD's Federal-Standard-1016 4800 bps code excited linear prediction
- voice coder version 3.2 (CELP 3.2) Fortran and C simulation source codes are
- now available for worldwide distribution at no charge (on DOS diskettes,
- but configured to compile on Sun SPARC stations) from:
-
- Bob Fenichel
- National Communications System
- Washington, D.C. 20305
- 1-703-692-2124
- 1-703-746-4960 (fax)
-
- In addition to the source codes, example input and processed speech files
- are included along with a technical information bulletin to assist in
- implementation of FS-1016 CELP. (An anonymous ftp site is being considered
- for future releases.)
-
- Copies of the FS-1016 document are available for $2.50 each from:
-
- GSA Rm 6654
- 7th & D St SW
- Washington, D.C. 20407
- 1-202-708-9205
-
- The following articles describe the Federal-Standard-1016 4.8-kbps CELP
- coder (it's unnecessary to read more than one):
-
- Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch,
- "The Federal Standard 1016 4800 bps CELP Voice Coder," Digital Signal
- Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.
-
- Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch,
- "The DoD 4.8 kbps Standard (Proposed Federal Standard 1016),"
- in Advances in Speech Coding, ed. Atal, Cuperman and Gersho,
- Kluwer Academic Publishers, 1991, Chapter 12, p. 121-133.
-
- Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, "The
- Proposed Federal Standard 1016 4800 bps Voice Coder: CELP," Speech
- Technology Magazine, April/May 1990, p. 58-64.
-
- For U.S. FED-STD-1016 (4800 bps CELP) _realtime_ DSP code
- and information about products using this code, contact:
-
- John DellaMorte
- DSP Software Engineering
- 165 Middlesex Tpk, Suite 206
- Bedford, MA 01730
- 1-617-275-3733
- 1-617-275-4323 (fax)
- dspse.bedford@channel1.com
-
- DSP Software Engineering's code can run on a DSP Research's Tiger 30 board
- (a PC board with a TMS320C3x and analog interface suited to development work)
- or on Intellibit's AE2000 TMS320C31 based 3" by 2.5" card.
-
- DSP Research
- 1095 E. Duane Ave.
- Sunnyvale, CA 94086
- (408)773-1042
- (408)736-3451 (fax)
-
- Intellibit
- P.O. Box 9785
- McLean, VA 22102-0785
- (703)442-4781
- (703)442-4784 (fax)
-
- From: tobiasr@monolith.lrmsc.loral.com (Richard Tobias )
-
- For U.S. FED-STD-1016 (4800 bps CELP) _realtime_ DSP code and
- information about products using this code using the AT&T DSP32C and
- AT&T DSP3210, contact:
-
- White Eagle Systems Technology, Inc.
- 1123 Queensbridge Way
- San Jose, CA 95120
- (408) 997-2706
- (408) 997-3584 (fax)
- rjjt@netcom.com
-
- ------------------------------------------------------------------------
- Creative Voice (VOC) file format
- --------------------------------
-
- From: galt@dsd.es.com
-
- (byte numbers are hex!)
-
- HEADER (bytes 00-19)
- Series of DATA BLOCKS (bytes 1A+) [Must end w/ Terminator Block]
-
- - ---------------------------------------------------------------
-
- HEADER:
- =======
- byte #     Description
- ------     ------------------------------------------
- 00-12      "Creative Voice File"
- 13         1A (eof to abort printing of file)
- 14-15      Offset of first datablock in .voc file (std 1A 00
-            in Intel Notation)
- 16-17      Version number (minor,major) (VOC-HDR puts 0A 01)
- 18-19      1's Comp of Ver. # + 1234h (VOC-HDR puts 29 11)
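-
- A header check along these lines is sketched below; voc_check_header
- is a made-up name. Note that the example bytes given for VOC-HDR
- (version 0A 01, check 29 11) match the bitwise complement of the
- version plus 1234h: ~010Ah + 1234h = 1129h in 16 bits. That is the
- arithmetic this sketch uses.
-

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Validate a .voc header. Returns the offset of the first data block
 * (normally 0x1A), or 0 if the header does not check out. */
unsigned voc_check_header(const unsigned char *buf, size_t len)
{
    uint16_t offset, version, check;

    /* Bytes 00-12 hold the signature, byte 13 the 1A end-of-file mark. */
    if (len < 0x1A || memcmp(buf, "Creative Voice File\x1a", 20) != 0)
        return 0;

    /* Intel notation: low order byte first. */
    offset  = (uint16_t)(buf[0x14] | (buf[0x15] << 8));
    version = (uint16_t)(buf[0x16] | (buf[0x17] << 8));
    check   = (uint16_t)(buf[0x18] | (buf[0x19] << 8));

    if (check != (uint16_t)(~version + 0x1234))
        return 0;
    if (offset < 0x1A)
        return 0;
    return offset;
}
```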
-
- - ---------------------------------------------------------------
-
- DATA BLOCK:
- ===========
-
- Data Block: TYPE(1-byte), SIZE(3-bytes), INFO(0+ bytes)
- NOTE: Terminator Block is an exception -- it has only the TYPE byte.
-
- TYPE  Description     Size (3-byte int)   Info
- ----  -----------     -----------------   -----------------------
- 00    Terminator      (NONE)              (NONE)
- 01    Sound data      2+length of data    *
- 02    Sound continue  length of data      Voice Data
- 03    Silence         3                   **
- 04    Marker          2                   Marker# (2 bytes)
- 05    ASCII           length of string    null terminated string
- 06    Repeat          2                   Count# (2 bytes)
- 07    End repeat      0                   (NONE)
-
- *Sound Info Format:         **Silence Info Format:
- ---------------------       ----------------------------
- 00   Sample Rate            00-01  Length of silence - 1
- 01   Compression Type       02     Sample Rate
- 02+  Voice Data
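-
- Stepping from one data block to the next only needs the TYPE byte and
- the 3-byte SIZE. The sketch below assumes the size is stored low order
- byte first, like the other multi-byte values in the header; the name
- voc_block_length is invented for this example.
-

```c
#include <stddef.h>

/* Read the TYPE and SIZE of the data block starting at p.
 * Returns the number of bytes the whole block occupies (1 for the
 * terminator, otherwise 4 + size), or 0 if fewer than the needed
 * header bytes are available. */
size_t voc_block_length(const unsigned char *p, size_t avail,
                        unsigned *type, unsigned long *size)
{
    if (avail < 1)
        return 0;
    *type = p[0];
    *size = 0;
    if (p[0] == 0x00)
        return 1;               /* terminator: TYPE byte only */
    if (avail < 4)
        return 0;
    *size = (unsigned long)p[1] |
            ((unsigned long)p[2] << 8) |
            ((unsigned long)p[3] << 16);
    return 4 + (size_t)*size;
}
```

- A reader would call this repeatedly, advancing by the returned length,
- until it sees TYPE 00.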
-
-
- Marker#           -- Driver keeps the most recent marker in a status byte
- Count#            -- Number of repetitions + 1
-                      Count# may be 1 to FFFE for 0 - FFFD repetitions
-                      or FFFF for endless repetitions
- Sample Rate       -- SR byte = 256-(1000000/sample_rate)
- Length of silence -- in units of sampling cycle
- Compression Type  -- of voice data
-                      8-bits    = 0
-                      4-bits    = 1
-                      2.6-bits  = 2
-                      2-bits    = 3
-                      Multi DAC = 3+(# of channels) [interesting--
-                        this isn't in the developer's manual]
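-
- The SR byte formula works in both directions, as the following sketch
- shows. It uses truncating integer division, which is an assumption;
- the description does not say how the division is rounded. Both
- function names are invented.
-

```c
/* SR byte for a sampling rate in Hz, or -1 if the rate cannot be
 * represented in one byte. */
int voc_byte_from_rate(unsigned long rate)
{
    unsigned long q;

    if (rate == 0)
        return -1;
    q = 1000000UL / rate;
    if (q < 1 || q > 256)
        return -1;
    return (int)(256 - q);
}

/* Sampling rate in Hz for a given SR byte. */
unsigned long voc_rate_from_byte(unsigned char sr)
{
    return 1000000UL / (256UL - sr);
}
```

- Because only one byte is available, most rates are approximate: an SR
- byte of 131 means exactly 8000 Hz and 156 means exactly 10000 Hz, but
- rates in between are rounded to the nearest representable step.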
-
- ------------------------------------------------------------------------
- RIFF WAVE (.WAV) file format
- ----------------------------
-
- RIFF is a format by Microsoft and IBM which is similar in spirit and
- functionality to EA-IFF-85, but not compatible (and it's in
- little-endian byte order, of course :-). WAVE is RIFF's equivalent of
- AIFF, and its inclusion in Microsoft Windows 3.1 has suddenly made it
- important to know about.
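-
- For contrast with the big-endian IFF files, the sketch below checks
- the first 12 bytes of a RIFF WAVE file: the characters "RIFF", a
- 32-bit byte count stored low order byte first, and the characters
- "WAVE". That layout is general knowledge about RIFF rather than
- something taken from the descriptions mentioned here, and the function
- name is_riff_wave is invented.
-

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian: low order byte first. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and stores the RIFF byte count if buf starts like a RIFF
 * WAVE file, 0 otherwise. */
int is_riff_wave(const unsigned char *buf, size_t len, uint32_t *riff_size)
{
    if (len < 12)
        return 0;
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return 0;
    *riff_size = le32(buf + 4);
    return 1;
}
```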
-
- Rob Ryan was kind enough to send me a description of the RIFF format.
- Unfortunately, it is too big to include here (27 k), but I've made it
- available for anonymous ftp as ftp.cwi.nl:/pub/RIFF-format.
-
- And here's a pointer to the official description from Matt Saettler,
- Microsoft Multimedia:
-
- "The complete definition of the WAVE file format as defined by
- IBM/Microsoft is available for anon. FTP from ftp.uu.net in the
- vendor/microsoft/multimedia directory."
-
- (Rob Ryan's version may actually be an extract from one of the files
- stored there.)
-
- ------------------------------------------------------------------------
- U-LAW and A-LAW definitions
- ---------------------------
-
- [Adapted from information provided by duggan@cc.gatech.edu (Rick
- Duggan) and davep@zenobia.phys.unsw.EDU.AU (David Perry)]
-
- u-LAW (really mu-LAW) is
-
-     y = sgn(m) * ln(1 + u*|m/mp|) / ln(1 + u),        |m/mp| =< 1
-
- A-LAW is
-
-     y = A*(m/mp) / (1 + ln A),                        |m/mp| =< 1/A
-
-     y = sgn(m) * (1 + ln(A*|m/mp|)) / (1 + ln A),     1/A =< |m/mp| =< 1
-
- Typical values are u=100 or 255 and A=87.6; mp is the peak message
- value and m is the current quantised message value. (The formulae get
- simpler if you substitute x for m/mp and sgn(x) for sgn(m); then
- -1 <= x <= 1.)
-
- Converting from u-LAW to A-LAW is in a sense "lossy" since there are
- quantizing errors introduced in the conversion.
-
- "..the u-LAW used in North America and Japan, and the
- A-LAW used in Europe and the rest of the world and
- international routes.."
-
- References:
-
- Modern Digital and Analog Communication Systems, B.P.Lathi., 2nd ed.
- ISBN 0-03-027933-X
-
- Transmission Systems for Communications
- Fifth Edition
- by Members of the Technical Staff at Bell Telephone Laboratories
- Bell Telephone Laboratories, Incorporated
- Copyright 1959, 1964, 1970, 1982
-
- ------------------------------------------------------------------------
- AVR File Format
- ---------------
-
- From: hyc@hanauma.Jpl.Nasa.Gov (Howard Chu)
-
- A lot of PD software exists to play Mac .snd files on the ST. One other
- format that seems pretty popular (used by a number of commercial packages)
- is the AVR format (from Audio Visual Research). This format has a 128 byte
- header that looks like this:
-
- char magic[4]="2BIT";
- char name[8]; /* null-padded sample name */
- short mono; /* 0 = mono, 0xffff = stereo */
- short rez; /* 8 = 8 bit, 16 = 16 bit */
- short sign; /* 0 = unsigned, 0xffff = signed */
- short loop; /* 0 = no loop, 0xffff = looping sample */
- short midi; /* 0xffff = no MIDI note assigned,
- 0xffXX = single key note assignment
- 0xLLHH = key split, low/hi note */
- long rate; /* sample frequency in hertz */
- long size; /* sample length in bytes or words (see rez) */
- long lbeg; /* offset to start of loop in bytes or words.
- set to zero if unused. */
- long lend; /* offset to end of loop in bytes or words.
- set to sample length if unused. */
- short res1; /* Reserved, MIDI keyboard split */
- short res2; /* Reserved, sample compression */
- short res3; /* Reserved */
- char ext[20]; /* Additional filename space, used
- if (name[7] != 0) */
- char user[64]; /* User defined. Typically ASCII message. */
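-
- Reading this header portably takes a little care, because short and
- long do not have the same sizes on every machine. The sketch below
- reads the fields by byte offset instead, assuming 2-byte shorts,
- 4-byte longs and high-order-byte-first storage as on the ST's 68000;
- the byte order is not stated above. struct avr_info and
- avr_parse_header are names invented for this example.
-

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct avr_info {
    int      stereo;      /* from "mono": nonzero means stereo */
    int      bits;        /* from "rez": 8 or 16 */
    int      is_signed;   /* from "sign" */
    uint32_t rate;        /* sample frequency in hertz */
    uint32_t size;        /* sample length in bytes or words */
};

static unsigned be16(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | p[1];
}

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the main fields of a 128-byte AVR header.
 * Offsets: magic 0, name 4, mono 12, rez 14, sign 16, loop 18,
 * midi 20, rate 22, size 26. Returns 0 on success, -1 otherwise. */
int avr_parse_header(const unsigned char *h, size_t len,
                     struct avr_info *out)
{
    if (len < 128 || memcmp(h, "2BIT", 4) != 0)
        return -1;
    out->stereo    = be16(h + 12) != 0;
    out->bits      = (int)be16(h + 14);
    out->is_signed = be16(h + 16) != 0;
    out->rate      = be32(h + 22);
    out->size      = be32(h + 26);
    return 0;
}
```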
- ------------------------------------------------------------------------
-